The automated categorization (or classification) of texts into predefined categories has
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning 
techniques: a general inductive process automatically builds a classifier by learning, from
a set of preclassified documents, the characteristics of the categories. ... 
Spoken dialogue systems allow users to interact with computer-based applications such as
databases and expert systems by using natural spoken language. The origins of spoken 
dialogue systems can be traced back to Artificial Intelligence research in the 1950s 
concerned with developing conversational interfaces. However, it is only within the last 
decade or so, with major advances in speech technology, that large-scale working systems 
have been developed and, in some cases, introduced into commerc ... 

Keywords: Dialogue management, human computer interaction, language generation, 
language understanding, speech recognition, speech synthesis 



Meaningful term extraction and discriminative term selection in text categorization via 
unknown-word methodology 
Yu-Sheng Lai, Chung-Hsien Wu 

March 2002 ACM Transactions on Asian Language Information Processing (TALIP), Volume 1 Issue 1
Publisher: ACM Press 
In this article, an approach based on unknown words is proposed for meaningful term
extraction and discriminative term selection in text categorization. For meaningful term 
extraction, a phrase-like unit (PLU)-based likelihood ratio is proposed to estimate the 
likelihood that a word sequence is an unknown word. On the other hand, a discriminative 
measure is proposed for term selection and is combined with the PLU-based likelihood 
ratio to determine the text category. We conducted several experim ... 
One problem in computer program testing arises when errors are found and corrected
after a portion of the tests have run properly. How can it be shown that a fix to one area 
of the code does not adversely affect the execution of another area? What is needed is a 
quantitative method for assuring that new program modifications do not introduce new 
errors into the code. This model considers the retest philosophy that every program 
instruction that could possibly be reached and tested from the ... 
P. David Stotts, Richard Furuta

January 1989 ACM Transactions on Information Systems (TOIS), Volume 7 issue 1
Publisher: ACM Press 
We present a formal definition of the Trellis model of hypertext and describe an authoring
and browsing prototype called αTrellis that is based on the model. The Trellis model
not only represents the relationships that tie individual pieces of information together into 
a document (i.e., the adjacencies), but specifies the browsing semantics to be associated 
with the hypertext as well (i.e., the manner in which the information is to be visited and 
presented). The model is based on Petri ... 
10 Human-computer interface development: concepts and systems for its management
Publisher: ACM Press

Human-computer interface management, from a computer science viewpoint, focuses on 
the process of developing quality human-computer interfaces, including their 
representation, design, implementation, execution, evaluation, and maintenance. This 
survey presents important concepts of interface management: dialogue independence, 
structural modeling, representation, interactive tools, rapid prototyping, development 
methodologies, and control structures. Dialogue independence is th ... 

11 Information storage and retrieval: a survey and functional description 
Jack Minker 

September 1977 ACM SIGIR Forum, Volume 12 issue 2 
Publisher: ACM Press 

Information Storage and Retrieval (IS&R) encompasses a broad scope of topics ranging
from basic techniques for accessing data to sophisticated approaches for the analysis of
natural language text and the deduction of information. Within the field, three general
areas of investigation can be distinguished not only by their subject matter but also by the
types of individuals presently interested in them: (1) Document retrieval, (2) Generalized
data management, and (3) Question-answering. A functional ...

Keywords: automatic indexing, data management, data structures, deductive search, 
information retrieval, natural language, problem solving, question-answering, relational 
data systems, theorem proving 



12 Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced 
Studies on Collaborative research 

Publisher: IBM Press 

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the execution 
of the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not 
provide the user with the desired overview of the application. In our experience, such tools 
display repeated occurrences of non-trivial commun ... 




13 Subject and classification-code indexes 
February 1973 Proceedings of the 1st annual computer science conference on Program information abstracts CSC '73

Publisher: ACM Press 

These indexes were prepared by William S. Stalcup, Steven A. Holton and Anthony E. 
Petrarca, Department of Computer and Information Science, The Ohio State University 
with the aid of programs developed by W. Michael Lay as part of his Doctoral research. 
The technique used for production of these indexes is a variation of the Double-KWIC 
Coordinate Indexing Technique, various aspects of which have been described by A. E.
Petrarca and W. M. Lay in J. Chem. Doc. 9, 256 (1969); & ...

14 Special issue: AI in engineering
D. Sriram, R. Joobbani 
April 1985 ACM SIGART Bulletin, issue 92 
Publisher: ACM Press 

The papers in this special issue were compiled from responses to the announcement in the 
July 1984 issue of the SIGART newsletter and notices posted over the ARPAnet. The 
interest being shown in this area is reflected in the sixty papers received from over six 
countries. About half the papers were received over the computer network. 





15 Techniques for automatically correcting words in text
Karen Kukich 

December 1992 ACM Computing Surveys (CSUR), Volume 24 issue 4 
Publisher: ACM Press 

Research aimed at correcting words in text has focused on three progressively more 
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient pattern-matching
and n-gram analysis techniques have been developed for detecting strings that
do not appear in a given word list. In response to the second problem, a variety of general 
and application-specific spelling cor ... 

Keywords: n-gram analysis, Optical Character Recognition (OCR), context-dependent 
spelling correction, grammar checking, natural-language-processing models, neural net 
classifiers, spell checking, spelling error detection, spelling error patterns, statistical-language
models, word recognition and correction
17 Modeling for text compression

Timothy Bell, Ian H. Witten, John G. Cleary 

December 1989 ACM Computing Surveys (CSUR), Volume 21 issue 4
Publisher: ACM Press 

The best schemes for text compression use large models to help them predict which
characters will come next. The actual next characters are coded with respect to the
prediction, resulting in compression of information. Models are best formed adaptively,
based on the text seen so far. This paper surveys successful strategies for adaptive
modeling that are suitable for use in practical text compression systems. The strategies
fall into three main classes: finite-context modeling, i ...



18 Expressiveness of structured document query languages based on attribute grammars
Frank Neven, Jan Van Den Bussche

January 2002 Journal of the ACM (JACM), Volume 49 issue 1

Publisher: ACM Press 



Structured document databases can be naturally viewed as derivation trees of a context-free
grammar. Under this view, the classical formalism of attribute grammars becomes a
formalism for structured document query languages. From this perspective, we study the 
expressive power of BAGs: Boolean-valued attribute grammars with propositional logic 
formulas as semantic rules, and RAGs: relation-valued attribute grammars with first-order 
logic formulas as semantic rules. BAGs can express only unary qu ... 

Keywords: Attribute grammars, automata, complexity, logic, structured documents 
Publisher: Association for Computational Linguistics

We present a novel approach to parsing phrase grammars based on Eric Brill's notion of 
rule sequences. The basic framework we describe has somewhat less power than a finite-state
machine, and yet achieves high accuracy on standard phrase parsing tasks. The rule
language is simple, which makes it easy to write rules. Further, this simplicity enables the 
automatic acquisition of phrase-parsing rules through an error-reduction strategy. 